
    On Deep Multiscale Recurrent Neural Networks

    Deep learning, the study of deep artificial neural networks, has led to several breakthroughs in many machine learning applications. In this thesis, a subgroup of deep learning models, known as recurrent neural networks, is studied in depth. Recurrent neural networks are special types of artificial neural networks that are particularly strong at modelling the temporal structure of sequential data such as text and speech. They are used as the core module of many practical applications, including speech recognition, text-to-speech, machine translation, machine comprehension, and question answering. However, our understanding of recurrent neural networks is still limited, and some of their inherent problems remain unresolved. This thesis comprises a series of studies towards deep multiscale recurrent neural networks and novel architectures that overcome these inherent problems.

    In the first article, we introduce a deep recurrent neural network that can adaptively control the connectivity patterns between layers at consecutive time steps. The recurrent connections between time steps are not restricted to self-connections, as in conventional recurrent neural networks; instead, a higher-level layer can connect to lower-level layers, and vice versa. A set of parametrized scalar gating units is learned in order to open or close the connections that carry feedback from the layers at the previous time step. We investigate how this top-down information can be useful for modelling temporal dependencies.

    In the second article, we study a neural machine translation system that exploits a character-level decoder. The motivation behind this work is to answer a fundamental question: can we generate a character sequence as a translation instead of a sequence of words? To answer it, we design both a naive two-level recurrent neural network and a more advanced recurrent neural network whose layers capture fast and slow components separately. The proposed model is based on the idea of modelling temporal dependencies with multiple components that update at different timescales.

    In the third article, we investigate a framework that can discover the latent hierarchical structure in sequences with recurrent neural networks. The proposed framework introduces a set of boundary-detecting units that detect the ends of meaningful chunks. The recurrent neural network updates each hidden layer at a different timescale based on the binary states of these boundary-detecting units. The boundary detectors enable a novel update mechanism with three types of operations: each layer can choose either to copy the previous state unchanged, to update the state, or to flush the state to the upper-level layer and reset the context.

    Finally, in the fourth article, we study the inclusion of latent variables in recurrent neural networks. The complexity and low signal-to-noise ratio of sequential data such as speech make it difficult to learn meaningful structure from the data. We propose a recurrent extension of the variational auto-encoder that introduces high-level latent variables into recurrent neural networks, and we show performance improvements on sequence modelling tasks such as modelling human speech signals and handwriting.
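    To make the three-operation update rule from the third article concrete, here is a minimal NumPy sketch of a single layer's step. It is an illustration under simplifying assumptions, not the thesis code: the published model uses LSTM-style gates and learns the binary boundary states with a straight-through estimator, whereas here the boundary indicators z_prev and z_below are supplied by hand.

```python
import numpy as np

def hmrnn_layer_step(h_prev, h_below, z_prev, z_below, W, U, b):
    """One step of a (heavily simplified) hierarchical multiscale layer.

    z_prev  -- boundary state of *this* layer at the previous time step
    z_below -- boundary state of the layer below at the current time step
    The layer picks one of the three operations named in the abstract:
      FLUSH  -- this layer just ended a chunk: the old state was passed
                upward, so restart the context from the bottom-up input
      UPDATE -- the layer below ended a chunk: absorb its summary
      COPY   -- nothing ended: keep the previous state unchanged
    """
    if z_prev == 1:                           # FLUSH
        return np.tanh(W @ h_below + b)
    if z_below == 1:                          # UPDATE
        return np.tanh(U @ h_prev + W @ h_below + b)
    return h_prev                             # COPY

# Toy usage: an UPDATE step with hypothetical dimensions and weights.
rng = np.random.default_rng(0)
d = 6
W, U, b = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=d)
h = hmrnn_layer_step(np.zeros(d), rng.normal(size=d), z_prev=0, z_below=1,
                     W=W, U=U, b=b)
```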

    Unveiling the potential of Graph Neural Networks for network modeling and optimization in SDN

    Network modeling is a critical component for building self-driving Software-Defined Networks, in particular for finding optimal routing schemes that meet the goals set by administrators. However, existing modeling techniques do not meet the requirements for providing accurate estimates of relevant performance metrics such as delay and jitter. In this paper we propose a novel Graph Neural Network (GNN) model able to understand the complex relationships between topology, routing, and input traffic, and to produce accurate estimates of the per-source/destination-pair mean delay and jitter. GNNs are tailored to learn and model information structured as graphs; as a result, our model is able to generalize over arbitrary topologies, routing schemes, and variable traffic intensities. In the paper we show that our model provides accurate estimates of delay and jitter (worst case R^2 = 0.86) when tested against topologies, routing, and traffic not seen during training. In addition, we demonstrate the model's potential for network operation through several use cases that show its effective use in per-source/destination-pair delay/jitter routing optimization, and its generalization capabilities when reasoning about topologies and routing schemes not seen during training.
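    The key property claimed above, generalization over arbitrary topologies and routing schemes, comes from the GNN's message-passing structure: information propagates along the graph's edges rather than through a fixed-size input. The sketch below is a generic, hypothetical illustration of one propagation round, not the paper's model (which couples link and path states); all names and dimensions are assumptions.

```python
import numpy as np

def message_passing_round(h, edges, W_msg, W_upd):
    """One synchronous round of neural message passing.

    h     -- (num_nodes, d) array of node hidden states
    edges -- list of (src, dst) pairs describing the topology
    Every node aggregates transformed messages from its in-neighbours and
    updates its own state; stacking T rounds propagates information T hops,
    regardless of how many nodes or edges the graph has.
    """
    msgs = np.zeros_like(h)
    for src, dst in edges:
        msgs[dst] += np.tanh(h[src] @ W_msg)   # message along one edge
    return np.tanh(h @ W_upd + msgs)           # per-node state update

rng = np.random.default_rng(0)
n, d = 4, 8
h = rng.normal(size=(n, d))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]       # a toy ring topology
W_msg, W_upd = rng.normal(size=(d, d)), rng.normal(size=(d, d))
for _ in range(3):                             # three propagation rounds
    h = message_passing_round(h, edges, W_msg, W_upd)
# a readout over the final states would regress per-pair delay and jitter
```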

    Multi-instance learning for bipolar disorder diagnosis using weakly labelled speech data

    While deep learning is undoubtedly the predominant learning technique across speech processing, it is still not widely used in health-based applications. The corpora available for health-related recognition problems are often small, both in the total amount of data available and in the number of individuals represented. The Bipolar Disorder corpus, used in the 2018 Audio/Visual Emotion Challenge, contains only 218 audio samples from 46 individuals. Herein, we present a multi-instance learning framework aimed at constructing more reliable deep learning-based models under such conditions. First, we segment the speech files into multiple chunks. The problem is that each individual chunk is only weakly labelled: it is annotated with the label of the corresponding speech file, but may not be indicative of that label. We then train a deep learning-based (ensemble) multi-instance learning model aimed at solving this weakly labelled problem. The presented results demonstrate that this approach can improve the accuracy of feedforward, recurrent, and convolutional neural networks on the 3-class mania classification task undertaken on the Bipolar Disorder corpus.
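    As a rough sketch of the weak-label setup described above, the following hypothetical code splits one file into chunks that inherit the file's label and pools chunk-level posteriors into a file-level decision. Mean pooling is used here purely for illustration; the paper's ensemble strategy may differ, and dummy_scorer stands in for a trained chunk-level network.

```python
import numpy as np

def make_bag(signal, chunk_len, hop):
    """Split one speech file into overlapping chunks (one MIL 'bag').
    Each chunk inherits the file's label, hence the weak labelling."""
    return [signal[i:i + chunk_len]
            for i in range(0, len(signal) - chunk_len + 1, hop)]

def bag_predict(chunks, chunk_scorer):
    """MIL inference: score every chunk with the same model, then pool
    the chunk posteriors into a single file-level decision."""
    probs = np.stack([chunk_scorer(c) for c in chunks])
    return probs.mean(axis=0)      # mean pooling over instances

rng = np.random.default_rng(1)

def dummy_scorer(chunk):
    """Stand-in for a trained network on the 3-class mania task."""
    logits = rng.normal(size=3) + np.array([chunk.mean(), 0.0, 0.0])
    e = np.exp(logits - logits.max())
    return e / e.sum()

audio = rng.normal(size=16000)                   # one 'speech file'
bag = make_bag(audio, chunk_len=4000, hop=2000)  # weakly labelled chunks
print(bag_predict(bag, dummy_scorer))            # file-level posterior
```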

    Understanding Electricity-Theft Behavior via Multi-Source Data

    Electricity theft, the practice of tampering with electrical meters to avoid paying individual electricity bills, is a common phenomenon in developing countries. Considering its harm to both power grids and the public, several automated methods have been developed to recognize electricity-theft behaviors. However, these methods, which mainly assess users' electricity usage records, can be insufficient due to the diversity of theft tactics and the irregularity of user behaviors. In this paper, we propose to recognize electricity-theft behavior via multi-source data. In addition to users' electricity usage records, we analyze user behaviors by means of regional factors (non-technical loss) and climatic factors (temperature) in the corresponding transformer area. By conducting analytical experiments, we unearth several interesting patterns: for instance, electricity thieves are likely to consume much more electrical power than normal users, especially under extremely high or low temperatures. Motivated by these empirical observations, we further design a novel hierarchical framework for identifying electricity thieves. Experimental results based on a real-world dataset demonstrate that our proposed model achieves the best performance in electricity-theft detection (e.g., at least +3.0% in terms of F0.5 score) compared with several baselines. Last but not least, our work has been deployed by the State Grid of China and used to successfully catch electricity thieves in Hangzhou with a precision of 15% (an improvement from the 0% attained by several other models the company employed) during monthly on-site investigations.
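    For reference, the F0.5 score quoted above is the F-beta measure with beta = 0.5, which weights precision more heavily than recall; this suits theft detection, where every flagged user triggers a costly on-site investigation. A small sketch of the computation follows; the recall value is an illustrative assumption, not a figure from the paper.

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score; beta < 1 favours precision over recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# e.g. the deployment's reported 15% precision, with an assumed 60% recall
print(round(f_beta(0.15, 0.60), 3))   # -> 0.176
```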